Abstract: Weighted graphs obtained from co-occurrence in
user-item relations lead to non-metric topologies. We use this
semi-metric behavior to issue recommendations, and discuss its
relationship to transitive closure on fuzzy graphs. Finally, we
test the performance of this method against other item- and
user-based recommender systems on the MovieLens benchmark.
We show that including highly semi-metric edges in our
recommendation algorithms leads to better recommendations.

Index Terms: recommender systems; complex networks; network
theory (graphs); fuzzy systems

I. Introduction: Recommendation as Prediction 

The identification of association or correlation between
time events is important for many systems, such as recommender
systems, social behavior, functional brain interaction,
event detection, and financial forecasting. Recommender systems
are a good example of prediction, since the goal is to recommend
the items users may be interested in in the future, given
information about how they accessed or purchased items in the
past [1]. Recently, there has been much interest in the analysis
of complex networks [2], extracted from large collections of
textual documents and user access patterns, to predict social
behavior, including online behavior [3]. In previous work, we
developed complex network methods to uncover clusters in
non-metric network topologies that arise in weighted graphs
obtained from real-world data (e.g. via co-occurrence
statistics, see below). Our clustering methodology, which is
equivalent to what has become known more recently as link
communities [4], has been applied to social networks, word
networks, scientific journal networks, etc. (e.g. [5], [6]).

Of particular interest to prediction in recommendation, we
have developed measures to extract the graph edges which
most violate the triangle inequality: semi-metric associations
(see below). Our working hypothesis is that strong semi-metric
associations can be used to identify items with a higher
probability of co-occurring in the future, as well as the
dynamics of such networks in general [7]. This methodology has
been applied to recommender systems for the digital library at
the Los Alamos National Laboratory, the givealink.org
project, networks of felons obtained from intelligence records,
etc. The performance of this approach was assessed using
expert evaluations [5]. While this performance assessment
showed that recommendations issued on the basis of semi-metric
behavior were relevant to users, one has to worry
about the subjectivity of human experts. Moreover, it did
not allow us to draw conclusions about the ability of
semi-metric associations to predict future user choices in
recommender systems. To address these concerns, here we use the
MovieLens benchmark¹. The advantage of using this benchmark is
that it has been widely used to assess various recommender
systems in the literature. The disadvantage is that the results
are specific to the MovieLens database, which covers movie
preferences only. There are other datasets, such as the one
provided by Netflix², which we will address in future work.
Here, we simply want to establish, without expert subjectivity,
that semi-metric behavior can be useful to predict future user
behavior and thus issue quality recommendations; to achieve that
goal, as we show below, the MovieLens benchmark is sufficient.

II. Background 

A. Knowledge extraction in Proximity Graphs 

Our approach starts with a probabilistic proximity measure
computed from binary relations between any two sets of items
(e.g. keywords-documents or items-users). This measure is a
natural weighted extension [8], [9] of the Jaccard similarity
measure [10], which has been used extensively in computational
intelligence [11], [12]. Given a generic binary relation R
between sets X (of n elements x) and Y (of m elements y),
we extract two complementary proximity graphs: XYP and
YXP.
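
As a minimal sketch of how such a proximity graph might be computed, the following Python function assumes the usual min/max form of the Jaccard measure: the number of elements of Y related to both x_i and x_j, divided by the number related to either. The function name `jaccard_proximity` and the toy relation are illustrative assumptions, not taken from the paper.

```python
def jaccard_proximity(R):
    """Proximity between the rows of a binary relation R (list of 0/1 rows).

    Assumed form: |columns shared by both rows| / |columns used by either row|.
    """
    n = len(R)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            both = sum(min(a, b) for a, b in zip(R[i], R[j]))
            either = sum(max(a, b) for a, b in zip(R[i], R[j]))
            # Convention chosen here: rows with no relations get proximity 0.
            P[i][j] = both / either if either else 0.0
    return P


# Hypothetical relation: 3 elements of X (rows) by 4 elements of Y (columns).
R = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
P = jaccard_proximity(R)
```

Applying the same function to the transpose of R would give the complementary graph over the elements of Y. Rows that share no element of Y get proximity zero, which is what keeps the resulting matrix sparse.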
These measures equate proximity with co-occurrence.
$xyp(x_i, x_j)$ is the probability that both $x_i$ and $x_j$ are
related (co-occur) via R to the same elements $y \in Y$ (and only
those), and vice versa for $yxp$. Below, when we refer to a
proximity graph P, we mean a graph obtained via formula
(1). Other co-occurrence measures can be used to capture a
degree of proximity between elements of two sets in a binary
relation. In information retrieval, it is common to use the
cosine [13], Euclidean [14], and even mutual information
measures [15]. For characterizing closeness in relations, we
prefer our weighted Jaccard proximity measure because it
possesses several desirable characteristics. The Euclidean
measure is a similarity measure (it is transitive), but it
generates non-sparse matrices, since all finite elements of the
relation R lead to similarity greater than zero. This makes it
impractical for very large data sets. The cosine proximity
measure (which is typically not transitive) is scale-invariant,
which makes it very appealing for text documents of varying
size, but may be problematic in other domains. The weighted
Jaccard measure has aspects of both the Euclidean and the cosine
measures [14], and leads to sparse matrices.

¹ http://movilens.umn.edu
² www.netflix.com

Proximity graphs can be seen as associative knowledge
networks that represent how often items co-occur in a large
set of documents [?], [16]. The assumption is that items that
frequently co-occur are associated with a common concept
understood by the community of users and writers of the
documents. Notice that a graph of co-occurrence proximity
allows us to capture network associations rather than just
pairwise co-occurrence. In other words, we expect concepts or
themes to be organized in more interconnected sub-graphs, or
clusters of items, in the proximity networks. Indeed, we have
successfully used the modularity of proximity networks in
several knowledge extraction and literature mining applications,
from recommender systems [5] to biomedical text mining [17],
[6]. More recently, modularity detection in proximity graphs
has been rediscovered in the literature as the idea of link
communities [4], which applies the Jaccard similarity measure
to graphs prior to the identification of clusters.

B. Transitive and Distance Closure 

Proximity graphs are reflexive and symmetric fuzzy graphs.
We can perform a transitive closure of these graphs using the
composition of their connectivity matrices, which is done in
much the same way as the algebraic composition of matrices,
except that multiplication and summation are substituted by
generalized fuzzy logic conjunction ($\wedge$) and disjunction
($\vee$) operations, more generally known as T-Norms and
T-Conorms, respectively [18].
\[ P \circ P = \bigvee_k \wedge(p_{ik}, p_{kj}) = p'_{ij} \]

where P denotes a proximity graph, and $p_{ij} \in [0, 1]$ the
entries of its connectivity matrix. The most commonly used
operations are $\wedge$ = minimum (conjunction) and $\vee$ =
maximum (disjunction), but many large classes of such functions
are available [18]. The transitive closure $P^\infty$ of a
proximity graph P is obtained via the following algorithm [18]:

1) $P' = P \circ P$

2) If $P' \neq P$, make $P = P'$ and go back to step 1.

3) Stop: $P^\infty = P'$

The transitive closure of P yields a similarity graph. 
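
The closure procedure above can be sketched in Python as follows. This is an illustrative implementation that assumes the default max-min composition; the function names and the toy graph are hypothetical. The exact equality test used as the stopping rule is safe for min and max, because they only select values already present in the matrix; other T-Norms may call for a numerical tolerance instead.

```python
from functools import reduce


def compose(P, Q, conj=min, disj=max):
    """Fuzzy composition: entry (i, j) is disj over k of conj(P[i][k], Q[k][j])."""
    n = len(P)
    return [
        [reduce(disj, (conj(P[i][k], Q[k][j]) for k in range(n)))
         for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(P, conj=min, disj=max):
    """Repeat P' = P o P until the matrix stops changing."""
    while True:
        P_next = compose(P, P, conj, disj)
        if P_next == P:
            return P_next
        P = P_next


# Toy reflexive, symmetric proximity graph on three vertices.
P = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
closure = transitive_closure(P)
```

In this toy graph, vertices 0 and 2 have no direct edge, but the closure assigns them proximity 0.5, the weakest link on the path through vertex 1.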

Instead of a proximity graph, it is often useful to work with a
distance graph D, where $d_{ij} \in [0, \infty]$, $d_{ii} = 0$,
$d_{ij} = d_{ji}$. In this case, instead of proximity/similarity,
edge weights denote dissimilarity, represented with the
intuitive notion of distance. Similarly, we can compute a
distance closure, $D^\infty$, to obtain the smallest possible
distance between vertices. This is done in exactly the same way
as the transitive closure, except that matrix composition
becomes $D \circ D = f_k(g(d_{ik}, d_{kj})) = d'_{ij}$ for a pair
of monotonic functions f, g, which we have referred to elsewhere
as TD-Conorms and TD-Norms [19]. A special case of distance
closure is the metric closure, where $f(x, y) = \min(x, y)$ and
$g(x, y) = x + y$. This type of closure computes the shortest
path between all pairs of vertices in D; it is thus equivalent
to solving the All Pairs Shortest Paths (APSP) problem [20].
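
A minimal sketch of the metric closure follows, using the Floyd-Warshall algorithm as one standard way to solve APSP. The choice of Floyd-Warshall is an assumption made here for brevity, not necessarily the procedure used by the authors, and the function name and toy matrix are hypothetical.

```python
import math


def metric_closure(D):
    """All-pairs shortest distances with f = min and g = + (Floyd-Warshall)."""
    n = len(D)
    C = [row[:] for row in D]  # work on a copy so D is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = C[i][k] + C[k][j]
                if via_k < C[i][j]:
                    C[i][j] = via_k
    return C


# Toy distance graph; math.inf marks a missing edge.
D = [
    [0.0, 1.0, 5.0, math.inf],
    [1.0, 0.0, 1.0, math.inf],
    [5.0, 1.0, 0.0, 2.0],
    [math.inf, math.inf, 2.0, 0.0],
]
C = metric_closure(D)
```

Here the direct distance of 5 between vertices 0 and 2 is replaced by the indirect path of length 2 through vertex 1, and the missing edge between 0 and 3 receives the finite distance 4.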

We can define an isomorphism between the two types of
graphs and closures, but only by using a non-linear map
$\varphi$, since proximity edges are constrained to [0, 1], while
distance edges are constrained to $[0, +\infty]$ [19]. To
establish an isomorphism (for graphs P and D to commute), we
must guarantee:

\[ \forall i, j \in P : \; f_k\{ g(\varphi(p_{ik}), \varphi(p_{kj})) \} = \varphi\Big( \bigvee_k \{ \wedge(p_{ik}, p_{kj}) \} \Big) \]

which leads to the equations that allow us to define the
constraints of each operation:

\[
\begin{aligned}
g(d_{ik}, d_{kj}) &= \varphi(\wedge(\varphi^{-1}(d_{ik}), \varphi^{-1}(d_{kj}))) \\
f(d_{ik}, d_{kj}) &= \varphi(\vee(\varphi^{-1}(d_{ik}), \varphi^{-1}(d_{kj}))) \\
\vee(p_{ik}, p_{kj}) &= \varphi^{-1}(f(\varphi(p_{ik}), \varphi(p_{kj}))) \\
\wedge(p_{ik}, p_{kj}) &= \varphi^{-1}(g(\varphi(p_{ik}), \varphi(p_{kj})))
\end{aligned}
\tag{2}
\]



This isomorphism generalizes the concept of distance in
weighted graphs. Using different TD-Norms and TD-Conorms, we
can calculate different types of distances and shortest paths
in weighted graphs, such as metric distances, ultra-metric
distances, and diffusion distances, among an infinity of
possibilities.
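
To make the correspondence concrete, the sketch below derives the proximity-space operations induced by a given pair (f, g), assuming the map d = 1/p - 1 that the paper adopts as its simplest proximity-to-distance conversion (eq. 3). The helper names are hypothetical. For f = min and g = +, the induced operations should coincide with the maximum and the Hamacher product, and the code checks this numerically on one pair of values.

```python
import math


def phi(p):
    """Proximity to distance: d = 1/p - 1, with p = 0 mapped to infinity."""
    return math.inf if p == 0 else 1.0 / p - 1.0


def phi_inv(d):
    """Distance to proximity: p = 1/(d + 1), with infinity mapped to 0."""
    return 0.0 if math.isinf(d) else 1.0 / (d + 1.0)


def induced_conjunction(a, b, g=lambda x, y: x + y):
    """Conjunction in proximity space induced by the TD-Norm g."""
    return phi_inv(g(phi(a), phi(b)))


def induced_disjunction(a, b, f=min):
    """Disjunction in proximity space induced by the TD-Conorm f."""
    return phi_inv(f(phi(a), phi(b)))


def hamacher_product(a, b):
    """Closed form ab / (a + b - ab), defined as 0 when both inputs are 0."""
    return 0.0 if a == 0 and b == 0 else a * b / (a + b - a * b)


a, b = 0.5, 0.25
conj_value = induced_conjunction(a, b)  # distances 1 and 3 add up to 4
disj_value = induced_disjunction(a, b)  # the smaller distance, 1, is kept
```

Swapping in other monotonic pairs, for example g = max, would yield other closures such as the ultra-metric one mentioned above.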
Fig. 1. Isomorphism between the proximity and distance spaces, with their
respective transitive and distance closures.



C. Semi-metric behavior 

A high value of proximity means that two items from one
set (e.g. words) tend to co-occur frequently in another set of
objects (e.g. web pages). But what about items that do not
co-occur frequently with one another, but do occur frequently
with the same other elements? In other words, even if two
items do not co-occur much, they may occur very frequently
with a third item (or more). Should we infer that the two items
are related via indirect associations, that is, from
transitivity? We would expect items that are strongly indirectly
related to be more relevant than those that are not.

To build up a more intuitive understanding of transitivity in
weighted graphs, we convert our proximity graphs to distance
graphs via the isomorphism $\varphi$. The simplest
proximity-to-distance conversion function is:

\[ \varphi : \; d_{ij} = \frac{1}{p_{ij}} - 1 \tag{3} \]

A distance graph D, obtained via $\varphi$ from P, which is
itself obtained from co-occurrence data in some corpus (as
graphs XYP and YXP), does not, in general, yield a Euclidean
topology. This is because, for a pair of elements i and j, the
triangle inequality may be violated: $d_{ij} > d_{ik} + d_{kj}$
for some element k. This means that the shortest distance
between two elements may not be the direct edge but rather an
indirect path. Distance functions that violate the triangle
inequality are referred to as semi-metrics [21].
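
As a small worked example, the following sketch converts a toy proximity graph to distances with d = 1/p - 1 and lists the intermediate vertices through which the triangle inequality is violated. The function names and the proximity values are assumptions chosen for illustration.

```python
def to_distance(p):
    """Proximity-to-distance conversion d = 1/p - 1 (p = 0 gives infinity)."""
    return float("inf") if p == 0 else 1.0 / p - 1.0


def violating_vertices(D, i, j):
    """Vertices k for which the direct edge is longer than the path via k."""
    return [
        k for k in range(len(D))
        if k not in (i, j) and D[i][j] > D[i][k] + D[k][j]
    ]


# Items 0 and 2 rarely co-occur, but both co-occur often with item 1.
P = [
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.5],
    [0.1, 0.5, 1.0],
]
D = [[to_distance(p) for p in row] for row in P]

shortcut_02 = violating_vertices(D, 0, 2)  # direct distance 9 vs 1 + 1 via item 1
shortcut_01 = violating_vertices(D, 0, 1)  # direct distance 1 vs 9 + 1 via item 2
```

The edge between items 0 and 2 is semi-metric because the path through item 1 is shorter than the direct edge, whereas the edge between items 0 and 1 is not.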

Clearly, semi-metric behavior is a question of degree. For
some pairs of vertices in a distance graph, an indirect path may
provide a much shorter short-cut than for others. To measure
the degree of semi-metric behavior, we have introduced the
semi-metric and below average ratios [?]:
\[ s_{ij} = \frac{d_{ij}}{\underline{d}_{ij}}, \qquad b_{ij} = \frac{\bar{d}_i}{\underline{d}_{ij}} \]

where $\underline{d}_{ij}$ is the shortest, direct or indirect,
distance between i and j in distance graph D, and $\bar{d}_i$ is
the mean direct distance from i to all other $k \in D$ such that
$d_{ik} > 0$. $s_{ij}$ is positive and > 1 for semi-metric edges.
$s_{ij}$ and $b_{ij}$ are only applied to semi-metric edges
$d_{ij}$, where $\underline{d}_{ij} < d_{ij}$. $b_{ij}$ measures
how much the shortest indirect distance between i and j falls
below the average distance of i to all its directly associated
elements k. The below average ratio is designed to capture
semi-metric behavior of non-finite edges: $d_{ij} \to \infty$.
Note that $b_{ij} \neq b_{ji}$. $b_{ij} > 1$ denotes a below
average distance reduction (see [?] for more details).
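
The two ratios can be sketched in Python as follows. The code assumes that the semi-metric ratio is the direct distance divided by the shortest distance, and that the below average ratio is the mean direct distance of i divided by the shortest distance between i and j. It further assumes that only finite, positive direct distances enter the mean, and it does not guard against a zero shortest distance. All function names and the toy graph are hypothetical.

```python
import math


def shortest_distances(D):
    """All-pairs shortest distances (Floyd-Warshall) on a distance matrix."""
    n = len(D)
    C = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                C[i][j] = min(C[i][j], C[i][k] + C[k][j])
    return C


def semi_metric_ratio(D, C, i, j):
    """Assumed form: direct distance over shortest distance."""
    return D[i][j] / C[i][j]


def below_average_ratio(D, C, i, j):
    """Assumed form: mean finite direct distance of i over shortest distance."""
    direct = [d for k, d in enumerate(D[i]) if k != i and 0 < d < math.inf]
    return (sum(direct) / len(direct)) / C[i][j]


inf = math.inf
D = [
    [0.0, 1.0, 9.0, inf],
    [1.0, 0.0, 1.0, 1.0],
    [9.0, 1.0, 0.0, inf],
    [inf, 1.0, inf, 0.0],
]
C = shortest_distances(D)

s_02 = semi_metric_ratio(D, C, 0, 2)    # 9 / 2
b_03 = below_average_ratio(D, C, 0, 3)  # mean(1, 9) / 2
b_30 = below_average_ratio(D, C, 3, 0)  # mean(1) / 2
```

The pair (0, 3) has no direct edge, so its semi-metric ratio is infinite, yet the below average ratio remains finite and differs depending on which endpoint supplies the mean.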

III. Recommendation from Proximity Graphs 

We developed and tested two types of collaborative filtering
algorithms: proximity-based and semi-metric-based. The training
set is a relation between users (U) and items (I) from the past,
R : U × I, where $r_{ij} = 1$ if user i has accessed item j, and
$r_{ij} = 0$ otherwise. This relation is a rectangular matrix of
n × m entries. Given R, using eq. (1) we obtain user-based
(UIP) and item-based (IUP) proximity graphs, as well as
their isomorphic distance graphs obtained via the map of eq.
(3). UIP (IUP) is a weighted graph of n (m) elements. Let
us now describe our recommender algorithms based on these
graphs:

Algorithm 1: Item-Based Proximity
For each user i = 1, ..., n:

1) Retrieve the user vector $U_i$, containing the associated set
of items from the training set R.

2) From IUP remove all columns associated with items j
such that $r_{ij} = 0$ (items that do not appear in the user's
profile from step 1).

3) Calculate the mean value of row weights for each row in
the reduced IUP matrix obtained in step 2. This results
in a scalar score (in [0, 1]) for all items j = 1, ..., m.

4) User i is recommended the top n scored items.
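
The four steps can be sketched in Python as follows. This is an illustrative reading of Algorithm 1, not the authors' code: the function names and the toy matrices are hypothetical, and excluding items already in the user's profile from the recommendations is an added assumption.

```python
def item_scores(user_items, IUP):
    """Steps 1-3: mean proximity of every item to the items in the profile."""
    profile = [j for j, r in enumerate(user_items) if r == 1]
    if not profile:
        return [0.0] * len(IUP)
    # Keeping only the profile columns and averaging each row.
    return [sum(row[j] for j in profile) / len(profile) for row in IUP]


def recommend_items(user_items, IUP, top_n):
    """Step 4: return the top_n highest scoring items."""
    scores = item_scores(user_items, IUP)
    # Assumption: items the user already accessed are not recommended again.
    candidates = [j for j, r in enumerate(user_items) if r == 0]
    candidates.sort(key=lambda j: scores[j], reverse=True)
    return candidates[:top_n]


# Toy item-item proximity graph (4 items) and one user profile.
IUP = [
    [1.0, 0.6, 0.2, 0.0],
    [0.6, 1.0, 0.5, 0.1],
    [0.2, 0.5, 1.0, 0.4],
    [0.0, 0.1, 0.4, 1.0],
]
user = [1, 1, 0, 0]
scores = item_scores(user, IUP)
top = recommend_items(user, IUP, top_n=2)
```

In this example item 2 scores (0.2 + 0.5) / 2 = 0.35 and item 3 scores 0.05, so item 2 is ranked first.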

Algorithm 2: Item-Based Semi-metric
Same as Algorithm 1, except that IUP is enhanced with additional
edges. We calculate the metric closure from the proximity
relation IUP using the isomorphism of equation (3). From the
resulting distance graph, we identify the semi-metric pairs
(edges) with below average ratio $b_{ij}$ above a given
threshold, and insert the corresponding edges from the
transitive closure $IUP^\infty$ into the original proximity graph
(IUP). Finally, we use this proximity graph as input for the
item-based proximity algorithm (Algorithm 1). Notice that
$IUP^\infty$ is, in this case, the transitive closure isomorphic
to the metric closure of the distance graph. Therefore, the
respective conjunction and disjunction operations employed are
obtained from eq. (2) for f = min and g = +, given the
isomorphism of eq. (3). This results in $\vee = \max$ and
$\wedge(a, b) = ab/(a + b - ab)$ (Hamacher product).
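
One possible reading of the enhancement step is sketched below. It assumes the below average ratio is the mean finite direct distance of i divided by the shortest distance between i and j, computes the metric closure with Floyd-Warshall, and writes the closure proximity into entry (i, j) whenever the ratio exceeds the threshold. Because the ratio is not symmetric, the enhanced matrix may not be symmetric either. The function names, the threshold value, and the toy graph are all hypothetical.

```python
import math


def phi(p):
    return math.inf if p == 0 else 1.0 / p - 1.0


def phi_inv(d):
    return 0.0 if math.isinf(d) else 1.0 / (d + 1.0)


def metric_closure(D):
    n = len(D)
    C = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                C[i][j] = min(C[i][j], C[i][k] + C[k][j])
    return C


def enhance_with_semi_metric_edges(P, threshold):
    """Insert closure edges for semi-metric pairs whose b_ij exceeds threshold."""
    n = len(P)
    D = [[phi(p) for p in row] for row in P]
    C = metric_closure(D)
    enhanced = [row[:] for row in P]
    for i in range(n):
        direct = [d for k, d in enumerate(D[i]) if k != i and 0 < d < math.inf]
        if not direct:
            continue
        mean_direct = sum(direct) / len(direct)
        for j in range(n):
            if i == j or not (0 < C[i][j] < D[i][j]):
                continue  # not a semi-metric pair
            if mean_direct / C[i][j] > threshold:
                enhanced[i][j] = phi_inv(C[i][j])
    return enhanced


P = [
    [1.0, 0.5, 0.0, 0.1],
    [0.5, 1.0, 0.5, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.1, 0.0, 0.0, 1.0],
]
E = enhance_with_semi_metric_edges(P, threshold=1.0)
```

In the toy graph only the pair (0, 2) passes the threshold: its new proximity of 1/3 equals the Hamacher product of the two edges of weight 0.5 along the path through item 1. The enhanced matrix could then be passed to the item-based scoring of Algorithm 1.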

Algorithm 3: User-Based Proximity
For each user i = 1, ..., n:

1) Determine the k nearest users to user i from proximity
graph UIP: the k highest values of row i (neighborhood
of user i in graph UIP).

2) Recommend the top n most frequent items among the
neighborhood of user i obtained in step 1.
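
A minimal sketch of Algorithm 3 follows. The function name and toy data are hypothetical, and two details are assumptions not stated in the steps above: user i is excluded from its own neighborhood, and items already in the profile of user i are skipped.

```python
from collections import Counter


def user_based_recommend(i, R, UIP, k, top_n):
    """Recommend the items most frequent among the k nearest users of user i."""
    others = [u for u in range(len(UIP)) if u != i]
    neighbors = sorted(others, key=lambda u: UIP[i][u], reverse=True)[:k]
    counts = Counter()
    for u in neighbors:
        for j, r in enumerate(R[u]):
            # Assumption: items user i already accessed are not counted.
            if r == 1 and R[i][j] == 0:
                counts[j] += 1
    return [j for j, _ in counts.most_common(top_n)]


# Toy relation (4 users by 4 items) and an illustrative user proximity graph.
R = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
UIP = [
    [1.0, 0.5, 0.33, 0.0],
    [0.5, 1.0, 0.67, 0.0],
    [0.33, 0.67, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
recommended = user_based_recommend(0, R, UIP, k=2, top_n=2)
```

With k = 2 the neighborhood of user 0 is users 1 and 2; item 1 appears twice among them and item 2 once, so they are recommended in that order.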

Algorithm 4: User-Based Semi-metric
Here we enhance the user proximity graph UIP with semi-metric
edges, just as we did for IUP in Algorithm 2. Afterwards, we use
Algorithm 3.

For both semi-metric algorithms (2 and 4), the thresholds
for the below average ratio were set on the distribution of
$b_{ij}$ around the cut-off point of the power law.

IV. Experimental Evaluation 

1) Data Sets: We used the MovieLens benchmark data set.
This data set is a collection of votes, on a scale from
one to five, given by web users (943 users) to movies
(1682 movies), for a total of 100,000 ratings. In
our experiment, to ascertain the utility of semi-metric behavior
to predict user behavior, we do not need to use ratings; the
goal is to predict which movies users will rate in the future,
based on past behavior. Therefore, we converted ratings to
binary votes: one (rated) or zero (not rated).
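
The conversion to binary votes can be sketched as below. The input layout, a list of (user, item, rating) triples with zero-based indices, is an assumption made for illustration and is not a description of the MovieLens file format.

```python
def binarize_ratings(ratings, n_users, n_items):
    """Build the binary relation R: 1 if the user rated the item, else 0."""
    R = [[0] * n_items for _ in range(n_users)]
    for user, item, _rating in ratings:
        R[user][item] = 1  # the rating value itself is discarded
    return R


# Hypothetical ratings on the one-to-five scale.
ratings = [(0, 0, 5), (0, 2, 1), (1, 1, 3)]
R = binarize_ratings(ratings, n_users=2, n_items=3)
```

A low rating and a high rating both become a one, since only the fact that a movie was rated is used.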

2) Evaluation Metrics: We used the balanced F1 score,
based on precision and recall measures, as well as a variant
of Somers' D, the degree of agreement metric [22]. Precision,
recall, and the F1 measure are traditional measures in
information retrieval, computed for unranked retrieval. There
are other assessment measures for ranked results, such as the
Area Under the Precision and Recall Curve. But since we compare
our results to a previous benchmark effort that used the
Somers' D measure on a set of recommender systems [23], [24],
we also use it here. Below, the measures employed are defined:
[Table fragment: L+ 87.08 90.99 49.11]

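\[ \text{precision} = \frac{|top_n \cap test|}{|top_n|} \tag{4} \]

\[ \text{recall} = \frac{|top_n \cap test|}{|test|} \tag{5} \]

\[ F1 = \frac{2 \cdot \text{recall} \cdot \text{precision}}{\text{recall} + \text{precision}} \tag{6} \]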

where $top_n$ is the set of top n recommendations issued
by a recommender system, and $test$ is the set of relevant
or expected recommendations from the test set. The variant of
the Somers' D method used for the MovieLens dataset follows
the procedure described in [24]:

1) For each user, we take the row vector of similarities,
R : U × I, for each movie for the considered user.

2) Take only the non-watched movies for this user.

3) Rank the non-watched movies taking into consideration
all movies.

4) Compute the degree of agreement: consider each pair
(a, b) of movies from the recommended ranking, with a
in the test set and b not. If a is ahead of b, the pair is
correct (agreement); if b is ahead of a, the pair is incorrect:

\[ \text{agreement} = \frac{\sum \text{agreements}}{\sum \text{total of pairs}} \tag{7} \]

5) Compute the global degree of agreement.

This variant of the Somers' D degree of agreement gives us a
measure of how well our set of recommendations is distributed
in the first positions of our list of relevant items.
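
Both kinds of measure can be sketched in a few lines of Python. The function names and toy data are hypothetical, and pooling the pair counts over users to obtain the global degree of agreement is one reading of the sums in the agreement formula, stated here as an assumption.

```python
def precision_recall_f1(top_n, test):
    """Unranked precision, recall and balanced F1 for one recommendation set."""
    top_n, test = set(top_n), set(test)
    hits = len(top_n & test)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(test) if test else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def agreement_counts(ranking, test):
    """Return (agreements, total pairs) for one ranked list, best item first."""
    test = set(test)
    n_other = sum(1 for m in ranking if m not in test)
    n_test = len(ranking) - n_other
    agreements = 0
    remaining_other = n_other
    for m in ranking:
        if m in test:
            # Every non-test movie still to come is ranked behind m.
            agreements += remaining_other
        else:
            remaining_other -= 1
    return agreements, n_test * n_other


def global_agreement(per_user):
    """Assumed pooling: sum of agreements over sum of pairs across users."""
    agreements = sum(a for a, _ in per_user)
    pairs = sum(t for _, t in per_user)
    return agreements / pairs if pairs else 0.0


p, r, f1 = precision_recall_f1(["m1", "m2"], {"m1", "m3"})
user_a = agreement_counts(["m1", "m2", "m3", "m4"], {"m1", "m3"})
user_b = agreement_counts(["a", "b"], {"b"})
overall = global_agreement([user_a, user_b])
```

For the first user, three of the four (test, non-test) pairs are ordered correctly; for the second user, the single pair is ordered incorrectly, giving a pooled agreement of 3/5.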

V. Results 

We compare our results with those of Fouss et al. [24].
Table I shows our results for the proximity and semi-metric (SM)
approaches for item- and user-based recommender systems.
Tables II and III show the results obtained by Fouss et al. in
[24] for several item- and user-based recommender algorithms,
respectively. A good description of the algorithms involved in
this comparison can be found in Fouss et al. [24]. L+ is based
on the pseudo-inverse of the Laplacian matrix; PCA CT is based
on the principal component analysis of L+; kNN is based on
the k-nearest neighbors algorithm; Cosine is based on cosine
similarity; Katz is based on the Katz similarity index, which
was proposed in the social sciences; and Dijkstra is based
on the shortest paths between elements of the dataset.





                    Prox-Item-based   SM-Item-based   Prox-User-based   SM-User-based
Agreement (in %)    89.53             90.16           88.20             88.16
F1                  0.1827            0.1832          0.2130            0.2179

TABLE I

Results for recommendation system. Somers' D degree of
agreement [24] and F1 measure.



TABLE II

Results for item-based recommendation systems from [24].





                    PCA CT   L+      kNN     Cosine   Katz    Dijkstra
Agreement (in %)    82.46    93.02   92.63   92.73    89.82   76.09
#Neighbors          60       100     100     60       20      100

TABLE III

Results for user-based recommendation systems from [24].



The semi-metric approach improves the item-based proximity
method, in both the F1 and the Somers' D measures (Table I),
and is as good as the best item-based result reported in Fouss
et al. [24] (Table II). Notice that performance measures (on
a fixed gold standard) are not statistical, so all improvements
are significant. Our user-based algorithms are among the top
such algorithms (Table III), which tend to perform better than
item-based algorithms (Table II), though in our approach the
reverse was observed (Table I). In our user-based approach,
we see a slight improvement from including semi-metric edges
with the F1 measure, but not with Somers' D. A possible
explanation is that user-based approaches depend on
the number of neighbors around a given user. We leave an
analysis of the impact of the number of neighbors on our
user-based method for future work, since the objective of this
paper is simply to show that semi-metric behavior can improve
recommender predictions.

VI. Discussion and Conclusions 

We show that exploring the natural clustering of proximity
graphs (equation 1) leads to very simple but competitive
item- and user-based recommender systems, in comparison
to previous benchmarks in the literature [24]. Enhancing
proximity graphs with semi-metric edges further improves
recommendations, confirming the previous evidence in Rocha
et al. [5]; on the item-based approach we see an improvement
in both the F1 and Somers' D measures, while on the user-based
approach we see it only on the F1 measure. This improvement
is not dramatic, but shows that semi-metric edges can be
used to enhance prediction in recommender systems. Since
we have barely scratched the surface of understanding
semi-metric behavior in complex networks, the approach is
promising, leaving plenty of room to improve the basic
algorithms we introduced here.